Febrl – A Freely Available Record Linkage System with a Graphical User Interface
Abstract
Record or data linkage is an important enabling technology in the health sector, as linked data is a cost-effective resource that can help to improve research into health policies, detect adverse drug reactions, reduce costs, and uncover fraud within the health system. Significant advances, mostly originating from data mining and machine learning, have been made in recent years in many areas of record linkage. Most of these new methods are not yet implemented in current record linkage systems, or are hidden within ‘black box’ commercial software. This makes it difficult for users to learn about new record linkage techniques and to compare existing techniques with new ones. What is required are flexible tools that enable users to experiment with new record linkage techniques at low cost. This paper describes the Febrl (Freely Extensible Biomedical Record Linkage) system, which is available under an open source software licence. It contains many recently developed advanced techniques for data cleaning and standardisation, indexing (blocking), field comparison, and record pair classification, and encapsulates them in a graphical user interface. Febrl can be seen as a training tool suitable for users to learn and experiment with both traditional and new record linkage techniques, as well as for practitioners to conduct linkages with data sets containing up to several hundred thousand records.
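The four stages named in the abstract (cleaning and standardisation, indexing or blocking, field comparison, and record pair classification) can be illustrated with a minimal sketch. This is not Febrl's API or its algorithms: the records, function names, blocking key, similarity measure (Python's standard `difflib.SequenceMatcher`), and the 0.8 threshold are all assumptions chosen only to show how the stages fit together.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical toy records: (record id, given name, surname).
RECORDS = [
    (1, "Mary ", "Johnson"),
    (2, "mary", "Jonson"),
    (3, "Mark", "Smith"),
    (4, "Maria", "Jones"),
]


def standardise(value):
    """Cleaning step: collapse whitespace and lowercase the value."""
    return " ".join(value.split()).lower()


def blocking_key(record):
    """Indexing (blocking) step: first letter of the standardised surname.

    Only records that share a key are compared, which avoids comparing
    every record with every other record.
    """
    return standardise(record[2])[:1]


def compare(rec_a, rec_b):
    """Field comparison step: one similarity in [0, 1] per field."""
    return [
        SequenceMatcher(None, standardise(a), standardise(b)).ratio()
        for a, b in zip(rec_a[1:], rec_b[1:])
    ]


def classify(similarities, threshold=0.8):
    """Classification step: match if the mean similarity reaches the threshold."""
    return sum(similarities) / len(similarities) >= threshold


def link(records, threshold=0.8):
    """Run blocking, comparison and classification; return matched id pairs."""
    blocks = defaultdict(list)
    for record in records:
        blocks[blocking_key(record)].append(record)

    matches = []
    for block in blocks.values():
        for rec_a, rec_b in combinations(block, 2):
            if classify(compare(rec_a, rec_b), threshold):
                matches.append((rec_a[0], rec_b[0]))
    return matches


if __name__ == "__main__":
    print(link(RECORDS))
```

With these toy records, blocking groups records 1, 2 and 4 together, and only the pair (1, 2) is classified as a match. A simple threshold on the mean similarity stands in here for the more advanced classifiers the abstract refers to.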
Publication date: 2007